MMVar: Clustering Uncertain Objects via Minimization of the Variance of Cluster Mixture Models
Abstract
A major issue in clustering uncertain objects is the poor efficiency of existing algorithms, which is mainly due to the expensive computation of distances between uncertain objects. This paper addresses the issue through an original formulation of the problem, based on minimizing the variance of the mixture models that represent the clusters to be discovered. The proposed partitional clustering method, named MMVar, is highly efficient because it does not need any distance measure between uncertain objects. Experiments show that MMVar is faster than prominent state-of-the-art algorithms for clustering uncertain objects, while achieving better average accuracy in terms of both external and internal cluster validity criteria.
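As a rough illustration of why such an objective needs no pairwise distances, the sketch below computes the variance of an equal-weight mixture directly from each object's mean and variance, using the identity Var[X] = E[X^2] - (E[X])^2. The function names, the equal weighting, and the per-dimension summation are assumptions made for illustration; the abstract does not spell out the exact objective, and the paper's formulation may differ.

```python
def mixture_variance(means, variances):
    """Total variance of an equal-weight mixture of uncertain objects.

    means[i][d] and variances[i][d] are the expected value and variance
    of object i along dimension d (hypothetical input layout).
    """
    n = len(means)
    dims = len(means[0])
    total = 0.0
    for d in range(dims):
        # Second moment of the mixture: average of per-object E[X^2].
        second_moment = sum(v[d] + m[d] ** 2 for m, v in zip(means, variances)) / n
        # Mean of the mixture: average of per-object means.
        mixture_mean = sum(m[d] for m in means) / n
        total += second_moment - mixture_mean ** 2
    return total


def clustering_cost(means, variances, labels):
    """Sum of mixture variances over clusters (assumed objective)."""
    cost = 0.0
    for c in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        cost += mixture_variance([means[i] for i in idx],
                                 [variances[i] for i in idx])
    return cost


if __name__ == "__main__":
    means = [[0.0], [2.0], [10.0], [12.0]]
    variances = [[1.0], [1.0], [1.0], [1.0]]
    # Grouping nearby objects yields a lower cost than mixing far ones.
    print(clustering_cost(means, variances, [0, 0, 1, 1]))
    print(clustering_cost(means, variances, [0, 1, 0, 1]))
```

Under these assumptions, a partitional method could relocate objects between clusters whenever the move lowers `clustering_cost`, touching only per-cluster sums rather than object-to-object distances.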
Related resources
DAGger: Clustering Correlated Uncertain Data
- It can compute exact and approximate probabilities with error guarantees for the clustering output.
State-of-the-art techniques (e.g. UK-means, UKmedoids, MMVar):
- do not support the possible worlds semantics,
- lack support for correlations and assume probabilistic independence,
- use deterministic cluster medoids or expected means, and
- can only compute clustering based on expected distan...
An Efficient Uncertain Data Point Clustering Based On Probability–Maximization Algorithm
Clustering uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges both in modelling similarity between uncertain objects and in developing efficient computational methods. The existing methods extend traditional partitioning clustering methods like k-means and density-based clustering methods like DBSCAN and Kullback-Leibler to uncertain data, thus rel...
Clustering of Fuzzy Data Sets Based on Particle Swarm Optimization With Fuzzy Cluster Centers
In the current study, a particle swarm clustering method is suggested for clustering triangular fuzzy data. The proposed method finds fuzzy cluster centers; the more points from the corresponding cluster a fuzzy cluster center contains, the higher the clustering accuracy. Also, triangular fuzzy numbers are used to represent uncertain data. To compare triangular fuzzy ...
Robust Method for E-Maximization and Hierarchical Clustering of Image Classification
We developed a new semi-supervised EM-like algorithm that is given the set of objects present in each training image, but does not know which regions correspond to which objects. We have tested the algorithm on a dataset of 860 hand-labeled color images using only color and texture features, and the results show that our EM variant is able to break the symmetry in the initial solution. We compared...
A Bayesian mixture model for classification of certain and uncertain data
There are various methods for classifying certain data. However, the values of the variables are not always certain; they may instead lie in an interval, in which case the data are called uncertain. In recent years, under the assumption that uncertain data are normally distributed, several estimators for the mean and variance of this distribution have been proposed. In this paper, we co...
Publication date: 2013